TIME SERIES ANALYSIS SEMANTIC DATA CUBE
UNIVERSITY OF SALZBURG
DEPARTMENT OF GEOINFORMATICS
Prof. Stefan Lang
NAME: Tanya Singh
MATRICULATION NO.: 12031084
PROGRAMME: Copernicus Master in Digital Earth
COURSE: Analysis and Modelling
EMAIL: tanya.singh@stud.sbg.ac.at
DATE OF SUBMISSION: 31.07.2021
Table of Contents
Table of Figures
Abstract
1. Introduction
2. Sentinel-2 Semantic Data and Information Cube Austria - Sen2Cube.at
2.1 Overview
2.2 Use Cases
2.2.1 Semantic Content-Based Image Retrieval (SCBIR)
2.2.2 Cloud-free Composite
2.3 The application: demo.sen2cube.at
2.3.1 Overview
2.3.2 Application Components
2.3.3 Model Structure
3. Study Area
3.1 Pasterzenboden
3.2 Hallstätter Glacier
4. The Model
5. Results
5.1 Monthly changes in snow cover percentage in Pasterzenboden
5.2 Monthly changes in snow cover percentage in Hallstätter
6. Discussion
7. Conclusion
8. Acknowledgements
References
Table of Figures
Figure 1. Schematic illustration of a semantic EO data cube [1]
Figure 2. Sentinel-2 true color image from 24 June 2020 [14]
Figure 3. SIAM enrichment of the Sentinel-2 data from 24 June 2020 [14]
Figure 4. The AOI for which cloud free images are required [13]
Figure 5. SCBIR implemented in Sen2Cube.at [13]
Figure 6. Cloud free Vienna [13]
Figure 7. General overview of the interface [13]
Figure 8. Austrian Factbase [13]
Figure 9. North-Western Syria Factbase [13]
Figure 10. SemantiX [13]
Figure 11. Application Components [13]
Figure 12. Pasterzenboden
Figure 13. Hallstätter
Figure 14. Study area polygons in Sen2Cube.at
Figure 15. Snow cover Model
Figure 16. Yearly changes in snow cover for Pasterzenboden
Figure 17. Yearly changes in snow cover for Hallstätter
Abstract
Although the amount of free and open Earth observation (EO) data is increasing, new
information is not necessarily being generated at the same rate. Because numerical, sensory
data lack semantic meaning, the fundamental challenge in big EO data analysis is producing
information from EO data. A semantic data cube is an advancement of the state-of-the-art EO
data cube in which each observation has categorical information linked to it that can be
queried for analysis. Semantic EO data cubes are the foundation of big EO data expert
systems, which can automatically infer new information from human-understandable semantic
queries.
1. Introduction
A data cube can be broadly defined as a multidimensional array for organizing data, typically
along three dimensions: latitude, longitude, and time [1]. Compared to a file-based system, it
simplifies data storage, access and analysis. The concept of a data cube has become
prevalent over the past few years. For example, Australia was the first country
to establish a national-scale EO data cube [2], whose technology is now the foundation of
Digital Earth Australia [3] and the Open Data Cube (ODC) [4]. ODC technology is also
behind other operational EO data cubes, such as those in Colombia [5],
Switzerland [6] and Vietnam [7], the African Regional Data Cube [8], and at least nine other
national or regional initiatives still in progress [4]. Rasdaman [9], an array database
system developed in the mid-1990s, is the leading technology behind initiatives such as
EarthServer [10] and the Copernicus Data and Exploitation platform for Germany (CODE-DE) [11].
These state-of-the-art EO data cubes facilitate data uptake and deliver analysis-ready data
(ARD) [1]. ARD is calibrated data; it thus shifts the burden of preprocessing from the users
to the data providers, who are better equipped to process big data reliably [3].
Although data cubes make data access effective and efficient by providing users with
data tailored more specifically to their needs [12], the biggest challenge in EO
data analysis remains extracting information from the vast volume of sensory, numerical data [1].
Conventional EO data cubes lack semantics; a semantic EO data cube is therefore an advancement
of the state-of-the-art data cube that facilitates the production of knowledge from sensory
data. A semantic EO data cube can be defined as a spatio-temporal data cube containing EO data,
where for each observation at least one nominal (i.e., categorical) interpretation is available
and can be queried in the same instance [1]. With a semantic enrichment that includes clouds,
vegetation, water, and “other” categories, semantic content-based queries covering a
user-defined area of interest (AOI) over a given temporal extent become possible, such as
retrieving the most recent observations excluding clouds (e.g., a user-defined cloud-free
mosaic) or the observed moment in time with the maximum vegetation extent [1]. Fig. 1
illustrates the structure of a semantic data cube, in which each image has a categorical
interpretation attached to it. As an example, it can be used to retrieve images with cloud
cover <10% and snow cover <15% in a user-defined AOI over a period of time (T1, T2, T3 or T4).
Figure 1. Schematic illustration of a semantic EO data cube [1].
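The example query from Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Sen2Cube.at implementation: the category codes, the array layout (time, y, x) and the function name `select_scenes` are hypothetical and chosen only for this example.

```python
import numpy as np

# Hypothetical category codes for the semantic layer (not SIAM's actual codes).
OTHER, CLOUD, SNOW = 0, 1, 2

def select_scenes(cube, max_cloud=10.0, max_snow=15.0):
    """Return indices of time steps whose cloud and snow cover (in % of the
    AOI pixels) are both strictly below the given thresholds."""
    n_pixels = cube.shape[1] * cube.shape[2]
    cloud_pct = 100.0 * (cube == CLOUD).sum(axis=(1, 2)) / n_pixels
    snow_pct = 100.0 * (cube == SNOW).sum(axis=(1, 2)) / n_pixels
    keep = (cloud_pct < max_cloud) & (snow_pct < max_snow)
    return np.flatnonzero(keep).tolist()

# Four toy observations (T1..T4) of a 10 x 10 pixel AOI.
cube = np.full((4, 10, 10), OTHER)
cube[0, :5, :] = CLOUD      # T1: 50% cloud
cube[1, 0, :5] = CLOUD      # T2: 5% cloud, 0% snow
cube[2, :2, :] = SNOW       # T3: 20% snow
cube[3, 0, :] = SNOW        # T4: 10% snow, 0% cloud

selected = select_scenes(cube)   # zero-based indices of the retained time steps
```

Because the categorical interpretation is stored alongside each observation, the query reduces to counting category values per time step; no reflectance values need to be inspected.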
This paper discusses the interface and components of the Austrian semantic
data cube, Sen2Cube.at, and illustrates a use case: a time series analysis of snow cover change
from 2016 to 2020 in two AOIs in Austria.
2. Sentinel-2 Semantic Data and Information Cube Austria - Sen2Cube.at
2.1 Overview
Sen2Cube.at is the world's first operational semantic EO data cube. It covers
all of Austria and includes all available Sentinel-2 images since Sentinel-2A was launched in
2015, allowing the derivation of higher-level information from satellite images [13]. This
means obtaining answers to questions such as "Which images in a certain AOI are cloud-free?"
or "What was the phenology of this agricultural field in 2019?". Users do not
need programming knowledge or a detailed understanding of the raw image
data to get these results [13]. It is a user-friendly, web-based semantic image querying
system that allows users to create queries simply by combining building blocks [13].
Within the semantic querying process, each block represents a distinct, well-defined task or
value [13]. The Satellite Image Automatic Mapper (SIAM) decision tree software is
used to semantically enrich the EO data in the Sen2Cube.at system [14]. SIAM
automatically categorizes EO data that is calibrated to at least top-of-atmosphere (TOA)
reflectance from several optical sensors (e.g., Sentinel-2, Landsat-8, AVHRR, VHR) using a
per-pixel, physical spectral model-based decision tree [1]. SIAM is deemed automatic
because it does not rely on a priori knowledge and runs without any user-defined
parameterization or training data [14]. Figs. 2 and 3 show how SIAM semantically enriches
the EO data by categorizing the entire AOI.
Figure 2. Sentinel-2 true color image from 24 June 2020 [14]
Figure 3. SIAM enrichment of the Sentinel-2 data from 24 June 2020 [14]
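To make the idea of a per-pixel, rule-based decision tree concrete, the following sketch assigns a category to each pixel from its TOA reflectance. It does not reproduce SIAM's rules, which are far more detailed: the choice of bands, the thresholds, the category names and the function name `classify_pixels` are all illustrative assumptions.

```python
import numpy as np

def classify_pixels(blue, red, nir, swir):
    """Toy rule-based categorization of TOA reflectance (values in 0..1).

    The thresholds are hypothetical and chosen only for illustration.
    Like any fixed decision tree, the function needs no training data.
    """
    ndvi = (nir - red) / (nir + red + 1e-9)
    bright = (blue + red) / 2.0
    labels = np.full(blue.shape, "other", dtype=object)
    # Later rules overwrite earlier ones, so the order encodes priority.
    labels[ndvi > 0.4] = "vegetation"
    labels[(nir < 0.1) & (ndvi < 0.0)] = "water"
    labels[(bright > 0.5) & (swir > 0.3)] = "cloud"
    labels[(bright > 0.5) & (swir <= 0.3)] = "snow_or_ice"
    return labels

# One example pixel per category: vegetation, water, cloud, snow, bare soil.
blue = np.array([0.04, 0.06, 0.60, 0.80, 0.15])
red = np.array([0.05, 0.04, 0.60, 0.80, 0.20])
nir = np.array([0.45, 0.02, 0.60, 0.70, 0.25])
swir = np.array([0.20, 0.01, 0.45, 0.10, 0.30])

labels = classify_pixels(blue, red, nir, swir).tolist()
```

Each pixel is categorized independently of its neighbours, which is what makes this kind of enrichment straightforward to apply to every observation in a data cube.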
2.2 Use Cases
This section discusses selected use cases of Sen2Cube.at to illustrate the range of
applications of the system.
2.2.1 Semantic Content-Based Image Retrieval (SCBIR)
The Sen2Cube.at system can be used to query images based on their content, for example to
search for cloud-free images of an area. Fig. 4 shows a cloud-contaminated AOI, for which
the cloud filter in the Sen2Cube.at application can be used to retrieve cloud-free images.
The query result also shows which images are usable for a given application and which are
not.
Figure 4. The AOI for which cloud free images are required [13].
Figure 5. SCBIR implemented in Sen2Cube.at [13]
2.2.2 Cloud-free Composite
A cloud-free composite is an artificial image created by stitching together cloud-free/best
pixels according to user-defined rules in the area considered [13]. For example, Fig.
6 shows a cloud-free composite of Vienna for summer 2018.
Figure 6. Cloud free Vienna [13]
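One possible compositing rule is "per pixel, take the most recent cloud-free observation". The sketch below implements this single rule with NumPy as an illustration; the rule itself, the array layout (time, y, x) and the function name `latest_cloud_free` are assumptions for this example and are not taken from the Sen2Cube.at manual.

```python
import numpy as np

def latest_cloud_free(values, cloud_mask, fill=np.nan):
    """Per pixel, pick the most recent observation not flagged as cloud.

    values:     float array of shape (time, y, x), oldest observation first
    cloud_mask: boolean array of the same shape, True where cloudy
    """
    n_time = values.shape[0]
    clear = ~cloud_mask
    # argmax on the time-reversed stack finds the first True, i.e. the
    # most recent clear observation; convert it back to a forward index.
    last_clear = n_time - 1 - np.argmax(clear[::-1], axis=0)
    composite = np.take_along_axis(values, last_clear[None, ...], axis=0)[0]
    # Pixels that are cloudy in every observation get the fill value.
    return np.where(clear.any(axis=0), composite, fill)

# Three observations of a 1 x 2 pixel AOI.
values = np.array([[[1.0, 1.0]], [[2.0, 2.0]], [[3.0, 3.0]]])
clouds = np.array([[[False, True]], [[False, True]], [[True, True]]])
composite = latest_cloud_free(values, clouds)
```

In the toy data the first pixel is clear in the first two observations, so the composite takes its value from the second one; the second pixel is always cloudy and is filled with NaN.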
Beyond this, the system can be used to analyse spatially disaggregated information such as
water bodies and urban green areas, and for time series analysis, which is discussed in
detail in the following chapters.
2.3 The application: demo.sen2cube.at
2.3.1 Overview
Fig. 7 shows the interface of Sen2Cube.at, which has three factbases: Austrian,
North-Western Syria and SemantiX.
Figure 7. General overview of the interface [13]
Figure 8. Austrian Factbase [13]
Figure 9. North-Western Syria Factbase [13]
Figure 10. SemantiX [13]
The Austrian Factbase (Fig. 8) and the North-Western Syria Factbase (Fig. 9) use Sentinel-2
data, which are available from 2015 onwards. SemantiX (Fig. 10) is a research project that uses
Copernicus Sentinel-3 A/B imagery to establish, augment, and enhance Advanced Very High
Resolution Radiometer (AVHRR) time series, making them and derived essential climate
variables accessible through a semantic EO data cube [15].
2.3.2 Application Components
In an expert system (an information processing system), information is often inferred
from data by combining a knowledgebase with a factbase, which
allows the inference engine to analyze enormous volumes of data automatically [13].
1. Knowledgebase: The knowledgebase contains rulesets and models that express general or
particular knowledge of the world [13]. As the system is used, the knowledgebase
gradually expands as new models are added to it.
2. Factbase: It consists of the raw image data and the semantically enriched data. The area of
interest can be drawn manually as points, lines or polygons, or uploaded as a GeoJSON
dataset [13].
3. Inference: The inference engine executes the model created in the knowledgebase on the AOI
selected in the factbase to produce the results. The results can be visualized as a graph,
which is especially useful for time series analysis, or downloaded as a CSV file.
Figure 11. Application Components [13]
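The interplay of the three components can be sketched as follows. This is a deliberately small illustration and not the architecture of the application: the model name `snow_percentage`, the dictionary layout, the invented observations and the helper functions `infer` and `to_csv` are all assumptions made for the example.

```python
import csv
import io

# Knowledgebase: named models (rules). Each model maps one observation,
# here a list of per-pixel categories, to a single number.
knowledgebase = {
    "snow_percentage": lambda obs: 100.0 * obs.count("snow") / len(obs),
}

# Factbase: semantically enriched observations for one AOI, keyed by date.
# The values are invented for illustration.
factbase = {
    "2020-01-15": ["snow", "snow", "snow", "other"],
    "2020-07-15": ["snow", "other", "other", "other"],
}

def infer(model_name, kb, fb):
    """Inference engine: run one knowledgebase model on every observation."""
    model = kb[model_name]
    return {date: model(obs) for date, obs in sorted(fb.items())}

def to_csv(result):
    """Serialize the result as CSV text, similar to a downloadable export."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "value"])
    writer.writerows(result.items())
    return buffer.getvalue()

result = infer("snow_percentage", knowledgebase, factbase)
csv_text = to_csv(result)
```

Keeping the models separate from the data is what allows the same model to be re-run on a different AOI or factbase without modification.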
2.3.3 Model Structure
The model is defined by building blocks, each of which represents a discrete, clearly defined
task or value inside the semantic querying process [13]. These blocks are grouped into
different classes, some of which are discussed here.
1. Definitions - Entities: These are the phenomena or real-world objects of interest, for
example a field or a lake, for which the analysis is carried out [13]. Within the
spatio-temporal scope of the semantic query, an entity definition determines whether each
observation is identified as being part of this type of entity or not [13]. For example, for
the entity water, the entity definition stores boolean true where water is identified and
boolean false where it is not. These observations are aligned along two spatial dimensions
and one temporal dimension, making the data structure a cube [13].
2. Definitions - Properties: These describe the characteristics of the entity. For example, a
blue colour is a property of water.
3. Data and Information - Appearance: The appearance block contains information about
how an Earth observation measurement of the Earth's surface appears based on what it
represents [13]. Calculated indices (e.g., multi-spectral greenness indices, multi-spectral
brightness values) or categories (e.g., multi-spectral color spaces/categories) with some level
of semantic linkage can be used [13].
4. Data and Information - Atmosphere: All interpretations of measurements or
supplementary information associated with atmospheric phenomena or image effects (i.e., not
the Earth's surface) are included in the atmosphere block [13]. This covers estimates of haze,
cloud cover, fog, airborne particulates, and other factors that obstruct Earth surface
applications or are of particular importance to atmospheric applications [13].
5. Verbs: Verbs are the fundamental processing units. Each verb denotes a distinct action that
can be performed on a data cube, so each verb block is labelled with a single action
word (i.e., a verb) that describes the process. In terms of blocks, this is expressed with a
with-do structure: the cube is referenced in the with part and the action to be applied to it
in the do part [13].
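The entity definition and the with-do structure described in the list above can be mimicked with a boolean NumPy array and a small helper that applies verbs in sequence. The sketch is only an analogy to the block-based models: the categorical layer, the helper `with_do` and the two verb functions are hypothetical and do not correspond to the application's actual verb list.

```python
import numpy as np

# Hypothetical categorical layer with dimensions (time, y, x).
categories = np.array([
    [["water", "other"], ["water", "water"]],
    [["other", "other"], ["water", "other"]],
])

def entity(layer, category):
    """Entity definition: True where an observation belongs to the entity."""
    return layer == category

def with_do(cube, *verbs):
    """Apply verbs to a cube in order, mirroring a with-do block."""
    for verb in verbs:
        cube = verb(cube)
    return cube

# Two example verbs; the names are illustrative only.
def reduce_over_time(cube):
    return cube.sum(axis=0)            # per pixel: number of water observations

def count_nonzero(cube):
    return int(np.count_nonzero(cube))

water = entity(categories, "water")    # boolean cube of shape (2, 2, 2)
water_counts = with_do(water, reduce_over_time)
ever_water = with_do(water, reduce_over_time, count_nonzero)
```

Here `water_counts` holds, for every pixel, how often water was observed, and `ever_water` is the number of pixels that were water at least once.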
3. Study Area
3.1 Pasterzenboden
The Pasterze, Austria's largest glacier and the longest in the Eastern Alps, is a little over 8
kilometres long [16]. It is the source of the Möll and is located at the foot of the
Großglockner in the uppermost valley floor of the Mölltal (Pasterzenboden) [16]. Its area has
shrunk by about half since 1856, when it covered more than 30 km² [16].
Figure 12. Pasterzenboden
3.2 Hallstätter Glacier
The Hallstätter Glacier is the largest glacier in the Dachstein Mountains [17]. It lies
directly beneath the northern foot of the Dachstein itself and flows down to the Eissee lake
below the Simony Hut at a height of 2,205 m [17]. The glacier is very sensitive to
changes in climate and has retreated strongly over the last 20 years [17].
Figure 13. Hallstätter
Figure 14. Study area polygons in Sen2Cube.at
4. The Model
The following model produces a time series of snow cover change for the Pasterze glacier
(Pasterzenboden) and the Hallstätter Glacier over a period of five years (January 2016 to
December 2020).
Figure 15. Snow cover Model
The entity ‘SIAM_clouds’ denotes the category ‘cloud’, identified by the property ‘colour’ in
the class ‘atmosphere’ in the study area. This entity acts as the cloud mask that is
applied to the AOI when calculating the result. The second entity, ‘SIAM_snow’, denotes the
category ‘Snow or water ice’, identified by the property ‘colour’ in the class ‘appearance’ in
the study area. Both entity definitions are placed under semantic concepts. The model
calculates the mean percentage of snow, grouped by ‘month’, in the ‘spatial feature’ (AOI).
The cloud mask is applied with ‘invert’, which removes all observations that have boolean
true values for the category cloud and keeps only the false values in the cube. In this way,
only cloud-free observations are considered when calculating the snow percentage over time.
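The logic of the model can be approximated outside the application as shown below. This is one plausible reading of the block model, written as a sketch: the function name `monthly_snow_percentage`, the choice to compute the percentage relative to cloud-free pixels only, the decision to skip fully clouded observations, and the toy data are assumptions rather than documented behaviour of Sen2Cube.at.

```python
import numpy as np
from collections import defaultdict

def monthly_snow_percentage(dates, snow, cloud):
    """Mean percentage of snow among cloud-free pixels, grouped by month.

    dates: list of 'YYYY-MM-DD' strings, one per observation
    snow, cloud: boolean arrays of shape (time, y, x) covering the AOI
    """
    clear = ~cloud                      # 'invert': keep where cloud is False
    per_month = defaultdict(list)
    for t, date in enumerate(dates):
        n_clear = clear[t].sum()
        if n_clear == 0:
            continue                    # fully clouded observation: skipped
        pct = 100.0 * (snow[t] & clear[t]).sum() / n_clear
        per_month[date[:7]].append(float(pct))
    return {month: sum(v) / len(v) for month, v in sorted(per_month.items())}

# Toy AOI of 1 x 4 pixels with four observations (invented values).
dates = ["2016-01-05", "2016-01-20", "2016-08-10", "2016-08-25"]
snow = np.array([
    [[True, True, True, True]],
    [[True, False, True, True]],
    [[True, False, False, False]],
    [[True, True, True, True]],
])
cloud = np.array([
    [[False, False, False, False]],
    [[False, False, True, True]],
    [[False, False, False, False]],
    [[True, True, True, True]],
])
series = monthly_snow_percentage(dates, snow, cloud)
```

In the toy data the second January observation is half clouded, so only its two clear pixels count (50% snow), and the second August observation is dropped entirely because no pixel is cloud-free.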
5. Results
5.1 Monthly changes in snow cover percentage in Pasterzenboden
The results show a general pattern of consistently high snow cover from December to June, a
decline from June to a pronounced minimum in August, and a renewed rise from
August to December. In 2017, the decline in snow percentage from May onwards is steeper than
in other years. There is also an anomalous sharp rise in snow cover from September
2020 to October 2020, followed by a sharp fall until November 2020.
5.2 Monthly changes in snow cover percentage in Hallstätter
Hallstätter follows the same general seasonal pattern of snow cover change as
Pasterzenboden. The year 2016, however, shows a pattern that contrasts with Pasterzenboden
in the same year. There is an abrupt fall in snow percentage from August 2016 to September 2016.
To better understand the results, the model was re-run with the observations grouped by
year, and the snow cover changes were analysed on a yearly basis.
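Switching from monthly to yearly grouping only changes the key by which the observations are grouped. The following sketch shows this with a generic helper; the function name `grouped_mean` and the percentages are placeholders for illustration and are not results from this study.

```python
from collections import defaultdict

def grouped_mean(observations, key):
    """Mean of per-observation values, grouped by key(date).

    observations: iterable of ('YYYY-MM-DD', value) pairs
    """
    groups = defaultdict(list)
    for date, value in observations:
        groups[key(date)].append(value)
    return {k: sum(v) / len(v) for k, v in sorted(groups.items())}

def by_month(date):
    return date[:7]     # 'YYYY-MM'

def by_year(date):
    return date[:4]     # 'YYYY'

# Placeholder percentages, not results from this study.
observations = [
    ("2016-01-10", 90.0), ("2016-08-10", 70.0),
    ("2017-01-10", 96.0), ("2017-08-10", 64.0), ("2017-08-20", 62.0),
]
monthly = grouped_mean(observations, by_month)
yearly = grouped_mean(observations, by_year)
```

Note that grouping the observations directly by year weights every observation equally, so months with more cloud-free acquisitions contribute more to the yearly mean than they would if monthly means were averaged.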
6. Discussion
Figure 16. Yearly changes in snow cover for Pasterzenboden (x-axis: years, 2016 to 2020; y-axis: snow cover (%))
Figure 17. Yearly changes in snow cover for Hallstätter (x-axis: years, 2016 to 2020; y-axis: snow cover (%))
In Pasterzenboden, the general trend is a fall in snow cover percentage over the years.
The inter-annual changes may be explained by global synoptic processes. In meteorology,
the synoptic scale (also known as the large scale or cyclonic scale) refers to a horizontal
length scale of 1000 kilometres (620 miles) or more [18]. This corresponds to the horizontal
scale typical of mid-latitude depressions (e.g., extratropical cyclones) [18]. Most high- and
low-pressure areas seen on weather maps (such as surface weather analyses) are
synoptic-scale systems [18].
Hallstätter, in contrast, shows two peaks in snow cover, although over the study period the
changes have a negative slope overall. There is an anomalous rise in the percentage of snow
cover from 2016 to 2017, which could be explained by a closer review of the meteorological
conditions during this period. Overall, the yearly aggregation gives a useful overview of the
changes, and this methodology can be extended in the future to a larger dataset to obtain
more robust and meaningful results.
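The "negative slope" mentioned above can be quantified with an ordinary least-squares fit of the yearly values against the year. The sketch below shows the calculation; the function name `linear_trend` is hypothetical and the input values are placeholders, not the values plotted in Figs. 16 and 17.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope and intercept of values against years."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

years = [2016, 2017, 2018, 2019, 2020]
placeholder = [88.0, 90.0, 86.0, 85.0, 83.0]   # NOT the values from Fig. 16/17
slope, intercept = linear_trend(years, placeholder)
```

For the placeholder series the slope is negative, i.e. a loss of snow cover in percentage points per year. With only five yearly values such a slope is weak evidence of a trend, which is consistent with the suggestion to extend the analysis to a larger dataset.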
7. Conclusion
The aim of this work was to explain what a semantic EO data cube is and what image
retrieval, analysis, and information generation capabilities it provides by discussing a use
case for Sen2Cube.at. Many fields are underserved in relation to what EO can offer: although
a large amount of EO data is collected, only a small percentage of it is
exploited to produce information. Semantic EO data cubes combine EO data with an
interpretation for each observation of a scene, allowing users to run queries on big data and
time series that were previously impossible, and provide image-derived information
building blocks for analysis that are more meaningful than measured surface reflectance.
Sen2Cube.at is at an early stage and there is a lot to be discovered, especially the new
addition SemantiX, which opens up the opportunity to explore AVHRR data. Personally, I
found the interface user-friendly, although the visualization of the
results could be improved.
8. Acknowledgements
I would like to thank the research team at Z_GIS, Salzburg, Austria, for their support in
providing me with an account to access the online system for my work. Heartfelt gratitude to
Larisa Paulescu and Cesar Aybar Camacho for their constant help in understanding the
model-building process and the functioning of the system.
9. References
[1] H. Augustin, M. Sudmanns, D. Tiede, S. Lang and A. Baraldi, "Semantic Earth Observation Data
Cubes," MDPI Data, p. 1, 2019.
[2] A. Lewis, S. Oliver, L. Lymburner, B. Evans, L. Wyborn, N. Mueller, G. Raevski and J. Hooke, "The
Australian Geoscience Data Cube - Foundations and lessons learned," Remote Sens. Environ.,
pp. 276-292, 2017.
[3] T. Dhu, B. Dunn, B. Lewis, L. Lymburner, N. Mueller, E. Telfer, A. Lewis, A. McIntyre and S.
Minchin, "Digital earth Australia - Unlocking new value from earth observation data," Big Earth
Data, pp. 64-74, 2017.
[4] B. Killough, "Overview of the Open Data Cube Initiative," Valencia, Spain, 2018.
[5] C. Ariza-Porras, G. Bravo, M. Villamizar, A. Moreno, H. Castro, G. Galindo, E. Cabrera and S.
Valbuena, "CDCol: A Geoscience Data Cube that Meets Colombian Needs," Springer, 2017.
[6] G. Giuliani, B. Chatenoux, A. Bono, D. Rodila, J.-P. Richard, K. Allenbach, H. Dao and P. Peduzzi,
"Building an Earth Observations Data Cube: Lessons learned from the Swiss Data Cube (SDC) on
generating Analysis Ready Data (ARD)," Big Earth Data, 2017.
[7] T. Cottom, "An Examination of Vietnam and Space," Space Policy, 2019.
[8] GEO, "Group on Earth Observations (GEO). Digital Earth Africa," 2019. [Online]. Available:
https://www.earthobservations.org/documents/gwp20_22/DE-AFRICA.pdf. [Accessed 2021].
[9] P. Baumann, P. Furtado, R. Ritsch and N. Widmann, "The RasDaMan approach to
multidimensional database management," ACM Press, San Jose, CA, USA, 1997.
[10] P. Baumann, P. Mazzetti, J. Ungar, R. Barbera, D. Barboni, A. Beccati, L. Bigagli, E. Boldrini,
R. Bruno et al., "Big Data Analytics for Earth Sciences: The EarthServer approach," International
Journal of Digital Earth, 2016.
[11] T. Storch, C. Reck, S. Holzwarth and V. Keuck, "Code-De - the German Operational
Environment for Accessing and Processing Copernicus Sentinel Products," in Proceedings of the
2018 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2018.
[12] M. Sudmanns, D. Tiede, S. Lang, H. Bergstedt, G. Trost, H. Augustin, A. Baraldi and T. Blaschke,
"Big Earth data: Disruptive changes in Earth observation data management and analysis?,"
International Journal of Digital Earth, 2019.
[13] D. T., University of Salzburg, Department of Geoinformatics - Z_GIS, "Sen2Cube.at Manual,"
2021. [Online]. Available: https://manual.sen2cube.at/. [Accessed 27 July 2021].
[14] D. Tiede, A. Baraldi, M. Sudmanns and S. Lang, "Architecture and Prototypical Implementation
of a Semantic Querying System for Big Earth Observation Image Bases," Eur. J. Remote Sens.,
2017.
[15] H. Augustin, M. Sudmanns, H. B. S. Weber and D. Tiede, "SemantiX: a cross-sensor semantic EO
data cube to open and leverage AVHRR time-series and essential climate variables with
scientists and the public," EGU General Assembly, 2021.
[16] G. Fischer, "Bergauf," alpenverein.at, 2016.
[17] R. Moser, "Der Hallstätter Gletscher - heute der größte Gletscher der Nördlichen Kalkalpen,"
Oberösterreichische Heimatblätter, 1954.
[18] R. H. and J. Evans, "Synoptic Composites of the Extratropical Transition Lifecycle of North Atlantic
TCs as Defined Within Cyclone Phase Space," American Meteorological Society, 2003.